Gene set analysis for self-contained tests: complex null and specific alternative hypotheses

Authors

  • Yasir Rahmatallah
  • Frank Emmert-Streib
  • Galina V. Glazko
Abstract

MOTIVATION: The analysis of differentially expressed gene sets has become routine in the analysis of gene expression data. A multitude of tests is available, ranging from aggregation tests, which summarize gene-level statistics for a gene set, to true multivariate tests, which account for intergene correlations. Most of them detect complex departures from the null hypothesis, but when the null hypothesis is rejected, the specific alternative leading to the rejection is not easily identifiable.

RESULTS: In this article we compare the power and Type I error rates of minimum-spanning tree (MST)-based non-parametric multivariate tests with those of several multivariate and aggregation tests that are frequently used for pathway analyses. In our simulation study, we demonstrate that MST-based tests have power that is, in many settings, comparable with that of conventional approaches, and that they outperform conventional approaches in specific regions of the parameter space corresponding to biologically relevant configurations. Further, we find, for both simulated and gene expression data, that MST-based tests discriminate well against shift and scale alternatives. As a general result, we suggest a two-step practical analysis strategy that may increase the interpretability of experimental data: first, apply the most powerful multivariate test to find the subset of pathways for which the null hypothesis is rejected; second, apply MST-based tests to these pathways to select those that support specific alternative hypotheses.

CONTACT: [email protected] or [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
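To make the idea of an MST-based two-sample test concrete, the following is a minimal sketch of one well-known test of this kind, the Friedman-Rafsky multivariate runs test with a permutation p-value. The abstract does not state which MST statistics the authors use, so this choice is an assumption made for illustration. The function names (`mst_edges`, `cross_edges`, `mst_runs_test`), the simulated data, and all parameter values are hypothetical. The test pools the samples from both phenotypes, builds the MST on Euclidean distances between samples, and counts edges that join samples from different groups; unusually few such edges suggest that the two groups occupy different regions of expression space.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns n - 1 edges (i, j)."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n
    parent = [-1] * n
    best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the vertex outside the tree that is closest to the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best_dist[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        # Update the best known connection for every remaining vertex.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v] = d
                    parent[v] = u
    return edges


def cross_edges(edges, labels):
    """Number of MST edges whose endpoints carry different group labels."""
    return sum(1 for i, j in edges if labels[i] != labels[j])


def mst_runs_test(group_a, group_b, n_perm=999, seed=0):
    """Return (observed cross-edge count, lower-tail permutation p-value)."""
    points = list(group_a) + list(group_b)
    labels = [0] * len(group_a) + [1] * len(group_b)
    # The MST depends only on the pooled points, so it is built once.
    edges = mst_edges(points)
    observed = cross_edges(edges, labels)
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        permuted = labels[:]
        rng.shuffle(permuted)
        # Few cross edges count as evidence against the null hypothesis.
        if cross_edges(edges, permuted) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: 15 samples per phenotype, a gene set of 5 genes,
# with a mean shift of 2 applied to every gene in the second phenotype.
data_rng = random.Random(42)
control = [tuple(data_rng.gauss(0.0, 1.0) for _ in range(5)) for _ in range(15)]
treated = [tuple(data_rng.gauss(2.0, 1.0) for _ in range(5)) for _ in range(15)]
count, p_value = mst_runs_test(control, treated)
print(f"cross edges: {count}, permutation p-value: {p_value:.3f}")
```

In the two-step strategy described above, a test like this would be applied only in the second step, to gene sets already flagged by a more powerful multivariate test. This sketch does not distinguish shift from scale alternatives; doing so would require the specific MST-based statistics used in the article.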

Similar resources

Unite and conquer: univariate and multivariate approaches for finding differentially expressed gene sets

MOTIVATION Recently, many univariate and several multivariate approaches have been suggested for testing differential expression of gene sets between different phenotypes. However, despite a wealth of literature studying their performance on simulated and real biological data, still there is a need to quantify their relative performance when they are testing different null hypotheses. RESULTS...

HighProbability determines which alternative hypotheses are sufficiently probable: Genomic applications include detection of differential gene expression

Many genomic experiments, notably microarray experiments seeking to detect differential gene expression, involve calculating a large number of p-values. This leads to the multiple testing problem: when the number of null hypotheses is large, the probability of accepting at least one false alternative hypothesis is often much greater than the significance level of the tests, which tends to misle...

A Biological Evaluation of Six Gene Set Analysis Methods for Identification of Differentially Expressed Pathways in Microarray Data

Gene-set analysis of microarray data evaluates biological pathways, or gene sets, for their differential expression by a phenotype of interest. In contrast to the analysis of individual genes, gene-set analysis utilizes existing biological knowledge of genes and their pathways in assessing differential expression. This paper evaluates the biological performance of five gene-set analysis methods...

Likelihood Ratio Tests for Dependent Data with Applications to Longitudinal and Functional Data Analysis

The paper introduces a general framework for testing hypotheses about the structure of the mean function of complex functional processes. Important particular cases of the proposed framework are: 1) testing the null hypotheses that the mean of a functional process is parametric against a general alternative modeled by penalized splines; and 2) testing the null hypothesis that the means of two p...

Comprehensive Comparison of Gene Set Analysis Tools

Gene set analysis has enhanced the microarray data analysis field with biological insights. The first introduced and widely used method, over-representation analysis (ORA), is limited by its requirement for a predetermined list of differentially expressed genes. To overcome this limitation, distribution based analysis (DBA) methods were developed with different analysis steps and null hypoth...

Journal:
  • Bioinformatics

Volume 28, Issue 23

Pages: -

Publication date: 2012